Skip to content

OPENMP and __GNUC__ compiler safety and performance #19025

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jenshannoschwalm
Copy link
Collaborator

@jenshannoschwalm jenshannoschwalm added this to the 5.4 milestone Jun 29, 2025
@jenshannoschwalm jenshannoschwalm added the scope: codebase making darktable source code easier to manage label Jun 29, 2025
@kmilos
Copy link
Contributor

kmilos commented Jun 29, 2025

Sure, but I still prefer reverting commit 9d7ab0e at the same time.

Also, this seems to leave macOS/AppleClang as the only other (major/supported) environment w/ these pragmas on math functions... Are we sure they're needed there at all? Can't we just simply remove them? @zisoft

@jenshannoschwalm jenshannoschwalm force-pushed the compile_simd_optimizing branch from b0de442 to f14aa41 Compare June 29, 2025 15:30
@jenshannoschwalm
Copy link
Collaborator Author

Sure, but I still prefer reverting

Latest squashed and force-pushed commit includes the revert.

Are we sure they're needed there at all?

I would even say they are very likely not wanted and possibly harmful.

The whole story reminds me of

DT_OMP_DECLARE_SIMD()
static inline float sqf(const float x)

in rcd demosaicing we compile with some extra flags

  #pragma GCC optimize ("fast-math", "fp-contract=fast", "finite-math-only", "no-math-errno")

definitely improving performance but using above sqf() definition lead to artifacts.

@ralfbrown
Copy link
Collaborator

Performance-wise, the biggest impact comes from -ffast-math, and that's almost certainly the cause of artifacts. With that option enabled, the compiler is free to replace divisions by multiplications with the reciprocal, for which it uses a fast-but-approximate reciprocal instruction with the result that the division only has 12 bts of precision instead of 24....

@jenshannoschwalm
Copy link
Collaborator Author

Yes, so avoided divs there.

Yet, the specific problem there was related to the simd sqf() function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope: codebase making darktable source code easier to manage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants